11. Programming Bayes Rule

Flip One of Two

In [10]:
# Helper: extract the video ID from a youtu.be short URL for YouTubeVideo
def strip_url(url):
    return url.replace('https://youtu.be/','')

from IPython.display import YouTubeVideo
url = 'https://youtu.be/bQ7XIF23j-w'
YouTubeVideo(strip_url(url))
Out[10]:

Visualizing the given problem:

In [129]:
def format_equation(LHS, RHS):
    """
    returns formatted string with RHS bold, and blue
    """
    return '< {0} = <FONT COLOR="blue"><B>{1}</B></FONT> >'.format(LHS, RHS)

def draw_graph(highlight_nodes=()):  # immutable default avoids the shared-list pitfall

    from graphviz import Digraph
    g = Digraph('Work Flow')
    g.attr(rankdir='LR', ranksep='1', nodesep='0.85')
    g.attr('node', shape='circle', fontsize='10')
    g.attr('edge', fontsize='10')

    g.node('Root','R')
    g.node('C1')
    g.node('C2')
    g.node('H1','H')
    g.node('T1','T')
    g.node('H2','H')
    g.node('T2','T')

    g.edge('Root','C1',label=format_equation('p(C1)','p0'))
    g.edge('Root','C2',label=format_equation('p(C2)','1-p0'))
    g.edge('C1','H1',label=format_equation('p(H | C1)', 'p1'))
    g.edge('C1','T1',label=format_equation('p(T | C1)', '1-p1'))
    g.edge('C2','H2',label=format_equation('p(H | C2)', 'p2'))
    g.edge('C2','T2',label=format_equation('p(T | C2)', '1-p2'))

    for each_node in highlight_nodes:
        #print(each_node)
        g.node(each_node,style='filled',fillcolor='yellow')

    return g

draw_graph()
Out[129]:
[Graph output: Root -> C1 with p(C1) = p0; Root -> C2 with p(C2) = 1-p0; C1 -> H with p(H | C1) = p1; C1 -> T with p(T | C1) = 1-p1; C2 -> H with p(H | C2) = p2; C2 -> T with p(T | C2) = 1-p2]

What is p(H)?

$$ \textstyle \begin{array}{l} p(H)\ = p(C1)\,p(H\ |\ C1) + p(C2)\,p(H\ |\ C2) \\ = p0 \cdot p1 + (1 - p0) \cdot p2 \\ = (0.3)(0.5) + (1-0.3)(0.9) \quad \text{with } p0 = 0.3,\ p1 = 0.5,\ p2 = 0.9 \\ = 0.78 \end{array} $$
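The total-probability calculation above can be sketched in a few lines of Python. The function name `prob_heads` is an assumption made for illustration. The parameters follow the graph labels: `p0` is p(C1), `p1` is p(H | C1) and `p2` is p(H | C2).

```python
def prob_heads(p0, p1, p2):
    """Total probability of heads when coin 1 is picked with probability p0."""
    # Branch through C1: pick coin 1, then flip heads.
    via_c1 = p0 * p1
    # Branch through C2: pick coin 2, then flip heads.
    via_c2 = (1 - p0) * p2
    return via_c1 + via_c2

print(round(prob_heads(p0=0.3, p1=0.5, p2=0.9), 4))  # 0.78
```

Each term corresponds to one highlighted root-to-H path in the tree.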

In [131]:
highlight_nodes = ['Root','C1','H1','C2','H2']
draw_graph(highlight_nodes=highlight_nodes)
Out[131]:
[Graph output: the coin tree with nodes Root, C1, C2 and both H leaves highlighted in yellow]

Cancer Example 1

In [138]:
def draw_cancer_graph(highlight_nodes=()):  # immutable default avoids the shared-list pitfall

    from graphviz import Digraph
    g = Digraph('Work Flow')
    g.attr(rankdir='LR', ranksep='1', nodesep='0.85')
    g.attr('node', shape='circle', fontsize='10')
    g.attr('edge', fontsize='10')

    g.node('Root','R')
    g.node('C')
    g.node('NC')
    g.node('PosC','Pos')
    g.node('NegC','Neg')
    g.node('PosNC','Pos')
    g.node('NegNC','Neg')

    g.edge('Root','C',label=format_equation('p(C)','p0'))
    g.edge('Root','NC',label=format_equation('p(NC)','1-p0'))
    g.edge('C','PosC',label=format_equation('p(Pos | C)', 'p1'))
    g.edge('C','NegC',label=format_equation('p(Neg | C)', '1-p1'))
    g.edge('NC','PosNC',label=format_equation('p(Pos | NC)', '1-p2'))
    g.edge('NC','NegNC',label=format_equation('p(Neg | NC)', 'p2'))


    for each_node in highlight_nodes:
        #print(each_node)
        g.node(each_node,style='filled',fillcolor='yellow')

    return g

draw_cancer_graph()
Out[138]:
[Graph output: Root -> C with p(C) = p0; Root -> NC with p(NC) = 1-p0; C -> Pos with p(Pos | C) = p1; C -> Neg with p(Neg | C) = 1-p1; NC -> Pos with p(Pos | NC) = 1-p2; NC -> Neg with p(Neg | NC) = p2]

What is p(Pos)?

$$ \textstyle \begin{array}{l} p0 = 0.1 \\ p1 = 0.9 \\ p2 = 0.8 \\ \end{array} $$

Note that p2 now denotes p(Neg | NC), so p(Pos | NC) = 1 - p2. In the coin example, p2 was the probability of heads on the second branch, p(H | C2).

In [139]:
highlight_nodes = ['Root','C','PosC','NC','PosNC']
draw_cancer_graph(highlight_nodes=highlight_nodes)
Out[139]:
[Graph output: the cancer tree with nodes Root, C, NC and both Pos leaves highlighted in yellow]

$$ \textstyle \begin{array}{l} p(Pos)\ = p(C)\,p(Pos\ |\ C) + p(NC)\,p(Pos\ |\ NC) \\ = p0 \cdot p1 + (1 - p0) \cdot (1-p2) \\ = (0.1)(0.9) + (1-0.1)(1-0.8) \\ = 0.27 \end{array} $$

What is p( C | Pos )?

$$ \textstyle \begin{array}{l} p(C\ |\ Pos)\\ = \ \dfrac {p(C)\,p(Pos\ |\ C)}{p(C)\,p(Pos\ |\ C) + p(NC)\,p(Pos\ |\ NC)} \\ \\ = \ \dfrac {p0 \cdot p1}{p0 \cdot p1 + (1-p0)(1-p2)} \\ \\ = \ \dfrac {(0.1)(0.9)}{0.27} \\ \\ \approx \ 0.33 \end{array} $$
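As a rough sketch, the Bayes rule step above could be programmed as follows. The function name `prob_cancer_given_pos` is hypothetical. The arguments use the same meaning as the graph: `p0` is p(C), `p1` is p(Pos | C) and `p2` is p(Neg | NC).

```python
def prob_cancer_given_pos(p0, p1, p2):
    """Posterior p(C | Pos) from the prior p0, sensitivity p1 and specificity p2."""
    joint_c = p0 * p1                # p(C, Pos)
    joint_nc = (1 - p0) * (1 - p2)   # p(NC, Pos), since p(Pos | NC) = 1 - p2
    p_pos = joint_c + joint_nc       # normalizer p(Pos)
    return joint_c / p_pos

print(round(prob_cancer_given_pos(0.1, 0.9, 0.8), 4))  # 0.3333
```

The denominator is the same p(Pos) = 0.27 computed earlier, so the posterior is just the cancer branch's share of all positive outcomes.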

What is p( C | Neg )?

In [140]:
highlight_nodes = ['Root','C','NegC','NC','NegNC']
draw_cancer_graph(highlight_nodes=highlight_nodes)
Out[140]:
[Graph output: the cancer tree with nodes Root, C, NC and both Neg leaves highlighted in yellow]

Check Density

Piecewise Continuous Distribution

In [1]:
# Helper: extract the video ID from a youtu.be short URL for YouTubeVideo
def strip_url(url):
    return url.replace('https://youtu.be/','')

from IPython.display import YouTubeVideo
url = 'https://youtu.be/v30w4s2djfc'
YouTubeVideo(strip_url(url))
Out[1]:

Answer

$$ \textstyle \begin{array}{l} f(x) = \begin{cases} a & \text {if } x \leq \text {noon} \\ b & \text {if } x > \text {noon} \\ \end{cases} \\ \\ \text {x varies from 0 to 24 (hours), so noon is } x = 12 \\ a = 2b \\ \\ 12a + 12b = 1 \text { because the total area under a probability density is 1}\\ 12(2b) + 12b = 1 \\ 24b + 12b = 1 \\ 36b = 1 \\ \\ \therefore \ b = \dfrac {1}{36} \approx 0.0278 \\ \\ \ \ \ \ \ a = 2b = \dfrac {1}{18} \approx 0.0556 \end{array} $$
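The same algebra can be checked with exact fractions. This is a minimal sketch; the function `piecewise_heights` and its parameter names are assumptions introduced here. It solves `hours_before * a + hours_after * b = 1` under the constraint `a = ratio * b`.

```python
from fractions import Fraction

def piecewise_heights(ratio=2, hours_before=12, hours_after=12):
    """Solve for the two density heights (a, b) with a = ratio * b and total area 1."""
    # Substitute a = ratio * b into hours_before * a + hours_after * b = 1.
    b = Fraction(1, ratio * hours_before + hours_after)
    a = ratio * b
    return a, b

a, b = piecewise_heights()
print(a, b)                                    # 1/18 1/36
print(round(float(a), 4), round(float(b), 4))  # 0.0556 0.0278
```

Using `Fraction` keeps the result exact, so the area check `12 * a + 12 * b == 1` holds without floating-point tolerance.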

Calculate Density

In [2]:
url = 'https://youtu.be/0gcTEExMH3k'
YouTubeVideo(strip_url(url))
Out[2]:

Answer

$$ \textstyle \begin{array}{l} f(x) = \begin{cases} a & 3 \leq x \leq 3\frac{1}{2} \\ \\ 0 & \text {otherwise} \end{cases} \\ \\ a(3\frac{1}{2} - 3) = 1 \\ \\ a(\frac{1}{2}) = 1 \\ \\ \therefore \ a = 2 \end{array} $$

Note that $a = 2 > 1$. This is possible because $f$ is a density, not a probability: its height can exceed 1 as long as the total area under it equals 1.

For a function to qualify as a probability density, it:  
1. Need not be strictly positive (it can be zero in places, and zero is not positive)  
2. **Must be non-negative** (zero or positive)  
3. Need not be continuous (the piecewise example above is not)  
4. Need not be less than or equal to 1 (the height can exceed 1 as long as the total area is 1)  
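The conditions in the list can be checked numerically for the density from the previous answer. The sketch below uses a plain midpoint Riemann sum. The names `uniform_density` and `check_density` and the grid size are assumptions chosen for illustration, not part of the original exercise.

```python
def uniform_density(x, a=2.0, lo=3.0, hi=3.5):
    """Density from the example above: height a on [lo, hi], zero elsewhere."""
    return a if lo <= x <= hi else 0.0

def check_density(f, lo, hi, n=240_000):
    """Return (non_negative, area) for f sampled at n midpoints of [lo, hi]."""
    width = (hi - lo) / n
    values = [f(lo + (i + 0.5) * width) for i in range(n)]
    non_negative = all(v >= 0 for v in values)
    area = sum(values) * width  # midpoint Riemann sum
    return non_negative, area

non_negative, area = check_density(uniform_density, 0.0, 24.0)
# Expect non_negative to be True and the area to be close to 1,
# even though the height is 2 on the interval [3, 3.5].
print(non_negative, round(area, 6))
```

The function is zero over most of the day, jumps discontinuously and exceeds 1 in height, yet it still passes both checks.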

Answer to p( C | Neg ) from Cancer Example 1:

$$ \textstyle \begin{array}{l} p(C\ |\ Neg)\\ \\ = \ \dfrac {p(C)\,p(Neg\ |\ C)}{p(C)\,p(Neg\ |\ C) + p(NC)\,p(Neg\ |\ NC)}\\ \\ = \ \dfrac {(p0)(1-p1)}{(p0)(1-p1)\ +\ (1-p0)(p2) }\\ \\ = \ \dfrac {(0.1)(1-0.9)}{(0.1)(1-0.9)+(1-0.1)(0.8)}\\ \\ = \ \dfrac {0.01}{0.73} \approx \ 0.0137 \end{array} $$
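The p(C | Neg) calculation above follows the same pattern as any Bayes update: multiply each prior by its likelihood, then normalize. A generic sketch is shown below. The helper `posterior` and the dictionary layout are assumptions made for illustration.

```python
def posterior(prior, likelihood):
    """Bayes rule over several hypotheses.

    prior:      {hypothesis: p(hypothesis)}
    likelihood: {hypothesis: p(evidence | hypothesis)}
    """
    joint = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(joint.values())  # total probability of the evidence
    return {h: joint[h] / evidence for h in joint}

prior = {'C': 0.1, 'NC': 0.9}
neg_likelihood = {'C': 1 - 0.9, 'NC': 0.8}  # p(Neg | C) = 1 - p1, p(Neg | NC) = p2
post = posterior(prior, neg_likelihood)
print({h: round(p, 4) for h, p in post.items()})  # {'C': 0.0137, 'NC': 0.9863}
```

Because the posteriors are normalized by the same evidence term, they always sum to 1, which is a quick sanity check on the arithmetic.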